International Journal of Science and Engineering Applications 
Volume 12-Issue 07, 220 — 222, 2023, ISSN:- 2319 - 7560 
DOI: 10.7753/IJSEA 1207.1061 


Intelligent Software for Multi-Ethnic Spectrum-Assisted 
Vocal Music Teaching Based on Intelligent Audio 
Classification Algorithm 


Yuejuan Lai 
Nanhai Conservatory of Music 
Haikou University of Economics 
Haikou, Hainan, 571127, China 


Abstract:Aiming at the currently used parameters are mainly static features, a dynamic feature calculation method based on 
information theory is proposed, and the initial value in key frame extraction is set according to its physical meaning, and the A 
algorithm is used to decompose the tensor, and then a tensor-based model is proposed. classifier. The experimental results show that 
the characteristics of the tensor model have a certain improvement effect on the problem of violent audio classification. National vocal 
music is the concentrated expression form of the Chinese national characteristic culture, showing many characteristics and potentials. 
Recognizing the strengths and weaknesses of each other's culture establishes a theoretical foundation, finds a new theoretical fulcrum 
for re-evaluating the value of different music cultures fairly and reasonably, and provides the necessary international quality for 


cultivating talents in the new century. 
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1. INTRODUCTION 


With the widespread application of multimedia and Internet 
technologies, there is an increasing demand for efficient audio 
data retrieval [1]. Audio classification is an important method 
to extract structured information and semantic content in 
audio, and it is the basis of audio understanding, analysis and 
retrieval. Audio classification mainly includes two basic 
aspects: feature extraction and classification [2]. In terms of 
feature extraction, it has been widely used in artificial 
intelligence fields such as image recognition, but it is still 
rarely used in the field of audio sentiment analysis [3]. 


However, some studies have shown that the VGGish network 
can extract more comprehensive and complex features of the 
data, which can lay a good foundation for intelligently 
analyzing the emotion of audio information [4]. The 
traditional method is to manually review the audio and video 
uploaded by the user. Due to the large amount of network 
multimedia, the manual method will waste a lot of manpower. 
Therefore, an algorithm is needed that can automatically 
identify violent content [5]. Audio is an important part of 
multimedia information. The traditional method is to 
manually review the audio and video uploaded by users. 
However, due to the large amount of online multimedia, the 
manual method will waste a lot of manpower. Therefore, an 
algorithm is needed that can automatically identify violent 
content [6]. 


Audio is an important part of multimedia information. Our 
country is a multi-ethnic country, and excellent traditional 
culture has a long history. As a form of China's excellent 
traditional culture, national vocal music is inheriting and 
promoting folk songs and rap in various places [7]. With the 
continuous improvement of the level of education 
informatization, intelligent teaching has gradually become a 
trend, and the integration of information technology into vocal 
music classroom teaching also become inevitable [8]. Based 
on the premise that the country vigorously promotes the 
reform policy of music teaching curriculum in primary and 
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secondary schools, this research has very important theoretical 
and practical significance [9]. 


Since the formation and promotion of the concept of 
multiculturalism, the traditional music teaching model has 
been greatly affected, and has been unable to meet the 
teaching standards of music courses [10]. The current era is an 
era in which industrial civilization is turning to post-industrial 
civilization. The era of modern knowledge-based 
transformation; the impact of the fourth industrial revolution 
has had a profound impact on traditional music production 
methods, teaching methods and learning methods, and the 
new teacher-student relationship and teaching concepts are 
facing reconstruction [11]. 


The oppression of cultural identity by the wave of 
globalization. In order to allow users to efficiently access the 
specific segments retrieved, it is necessary to organize the 
management of audio data and perform appropriate encoding 
format conversion, and establish an application-oriented, 
encoding Unified [12], easily searchable audio library. 
However, the amount of audio data is often very large. When 
the audio signal changes with time, information changes must 
occur between frames, so this paper uses information theory 
to calculate dynamic features [13]. In terms of classification, 
the vector space model is a classic method in information 
classification, but in the similarity calculation, only the 
matching information of the weight vector is considered, and 
the relevant information between the feature items is ignored 
[14]. 


Since most of the features of audio signals are extracted based 
on frame granularity, for each sample. The extracted original 
feature is a matrix composed of the feature sequence of the 
frame [15]. Traditional methods often need to convert 
matrices into vector features for classification. Compared with 
other traditional methods, the classification algorithm has 
higher classification accuracy, faster calculation speed, is 
more suitable for processing large amounts of data, and is 
more intelligent, such as electrocardiographic signals [16]. As 
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a treasure of traditional Chinese music culture, ethnic vocal 
music is becoming more and more important. more attention. 
While the teaching of ethnic vocal music in colleges and 
universities is developing rapidly, there are also many 
problems. Therefore, in the perspective of multiculturalism, 
we must carry out targeted teaching reforms to promote better 
and faster development of ethnic vocal music [17]. 


The construction of intelligent teaching requires that vocal 
music teachers in higher vocational schools can make full use 
of various forms of wisdom to further enrich the teaching 
content of vocal music courses, and form a new teaching 
ecology that is more intelligent and more in line with the 
needs of students, so as to promote the development of 
students' innovative thinking and improve the vocal music 
classroom. teaching effectiveness [18]. Since most of the 
features of audio signals are extracted based on frame 
granularity, for each sample. The extracted original feature is 
a matrix composed of the feature sequence of the frame. 
Traditional methods often need to convert matrices into vector 
features for classification [19]. 


2. THE PROPOSED METHODOLOGY 
2.1 The MFCC 


The audio stream is divided into frames, and each frame 
signal is calculated to obtain a 13-dimensional feature vector, 
including 12-dimensional MFCC coefficients and 1- 
dimensional dynamic features. K-means is used to cluster the 
feature space, and the initial centroid is selected according to 
the physical meaning of the audio dynamic features, that is, 
the entire frame feature vector of the audio is quickly scanned. 
For a given number of components, from the effect point of 
view, the alternating least squares (Alternating Least Square) 
is a relatively efficient algorithm. 


A large number of experiments have proved that the ALS 
algorithm has a good trade-off between the computational cost 
and the quality of the result, and is easy to implement, 
guarantees convergence, and is easy to extend to high-order 
tensors. ANN can process some environmental information 
that is very complex, the background knowledge is unclear, 
and the samples have The problem of pattern recognition with 
large defects or distortion is very suitable for classifying radar 
signals, but due to its long training time and poor real-time 
performance. The design idea is realized. It provides a high- 
throughput, scalable, and scalable distributed file system. It 
can be quickly deployed on cheap servers. It is a highly fault- 
tolerant distributed system. It relies on the redundancy 
mechanism between nodes to perform data backup and 
recovery and provides high-throughput data storage functions. 
It is suitable as the storage basis for massive data. In VSM, 
the TFIDF formula is used to calculate the weight of 
keywords, that is, the product of word frequency and inverse 
document frequency. 


In the audio frame vector sequence, the frame frequency 
cannot completely and accurately measure the influence of the 
key frame. When the number of clustered frames in which the 
key frame is located, that is, the greater the frequency, the 
greater the effect of the key frame on distinguishing audio. 
Considering the large amount of data and it is not meaningful 
to retain the characteristics of each frame, here The 
eigenvectors of adjacent frames are averaged, so that the 
obtained eigenmatrix can more accurately express timing 
information and is more meaningful. In the actual 
transmission of the signal, the number of training samples 
obtained is very limited. At this time, it is difficult for many 
methods to achieve the ideal classification effect. Even in the 
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case of limited training samples, the use of a complex learning 
machine can make the learning error smaller, but 
generalization feature is often worse. 


2.2 The Multi-Ethnic Spectrum Assists 


Vocal Music Teaching 

My country has a vast territory, and under the colorful natural 
environment and cultural style, the vocal art of various 
regions presents the characteristics of distinct personality and 
multi-layered diversity, and the singing methods of each 
ethnic group have certain differences. Therefore, the content 
of ethnic vocal music teaching Also involved in a wider 
range. In the current vocal music teaching work, teachers 
must rely on rich and diverse digital teaching resources to 
carry out intelligent reform of teaching methods. However, 
from the survey of the current degree of digitization of vocal 
music teaching resources, it is found that there are only less 
than 40% of the existing resources. At the same time as the 
development of global integration is gradually deepening. 


In a modern society, China must constantly make changes, so 
as to better cater to the changes in the world's cultural 
situation. The driving force for the development of popular 
music lies in the commercial operation. Coupled with the 
influence of various factors, it affects the development and 
inheritance of traditional folk music culture. The rise of 
multicultural music education is closely related to the 
multicultural trend of thought in the world. Although "cultural 
diversity" as a cultural phenomenon has a long history in 
human society, "multiculturalism" as a social thought started 
in the early 20th century. Since the beginning of the new 
century, due to the proliferation of separatism and populism. 
Heavy. Calculate the class pattern of a certain type of audio 
set D in the entire training set, form a feature space of the 
frame sequence of all audio files of this class, perform 
clustering to obtain the key frames of this type of audio file, 
and calculate the frame distribution information as the weight, 
so we get A class pattern of , which represents this class of 
audio. 


Chinese national vocal music is the rich accumulation and 
artistic expression of traditional Chinese culture. It is deeply 
loved by the public for its clear rhythm, beautiful melody and 
sincere emotion. Diversification has gradually become the 
main direction of the development of music culture. Teachers 
should analyze reports in real time according to their learning 
situation, generate dynamic teaching content, and accurately 
match high-quality digital resources to students’ learning 
needs, so as to form specific teaching courses. How to 
organize these materials becomes the problem, and this is 
where the organization structure of the content arises. Because 
of the existence of subordination and logical relationship 
within the subject knowledge itself. 


2.3 The National Spectrum Assisted Vocal 


Teaching Smart Software Design 

The correct recognition rate for the three signals is close to 
0.99, so the classification algorithm has a good generalization 
ability and overcomes the problem of excessive dependence 
on the model. The radial basis kernel function or polynomial 
kernel function of different parameters has no obvious effect 
on the performance of the algorithm. Influence. We must open 
our horizons, not blindly impart various techniques of 
Western singing to students, but truly achieve mutual 
reference and communication between the various branches of 
ethnic singing and Western singing, and create a college 


221 


International Journal of Science and Engineering Applications 
Volume 12-Issue 07, 220 — 222, 2023, ISSN:- 2319 - 7560 
DOI: 10.7753/JSEA 1207.1061 


ethnic group that belongs to traditional Chinese culture and 
art. 


The road to teaching vocal music. From the above process of 
rhythm training and teaching, it can be seen that once rich 
resources and accurate data analysis tools are available, the 
diversified application and precise adaptation of existing high- 
quality teaching forms can be realized in the development of 
multicultural music teaching. In the process, the daily study 
life and music teaching should be integrated. While teaching 
traditional music knowledge, some other music elements 
should be integrated to enrich the teaching content. Huang 
Zhen (0) expounded on the application of multicultural music. 
In the history of education development, there are various 
educational theories, such as humanism, behaviorism, 
eternalism, progressivism, constructivism, critical theory, etc. 
There are different philosophical positions behind these 
theories. Such as the rationalist philosophy behind eternalism, 
the mechanical materialism behind behaviorism, etc. 


There are many classification algorithms for data mining, 
such as decision tree, regression analysis, Bayesian, neural 
network, support vector machine, etc. Curriculum setting is 
the basis and orientation of teaching. To build a perfect 
Chinese national vocal music teaching system, make the 
teaching content meet the requirements of Chinese national 
vocal music teaching reform. Then 10 students independently 
carried out the corresponding knowledge learning according 
to the requirements of this module, and participated in the 
learning of other modules according to the plan, and finally 
completed a practical teaching exercise for preschool children 
according to the learning plan. 


3. CONCLUSIONS 


The improved VSM algorithm proposed in this paper can 
better obtain the key information in the data. It adopts the 
sequence of key frames and weights, and preserves the 
relationship between the audio auditory characteristics. The 
relevant information of the audio frequency is preserved, the 
tensor features are constructed, the feature matrix of each 
sample is projected and dimensioned, and a classification 
method based on the tensor model is proposed. The most 
important task of national vocal music teaching is to establish 
a sound education system to fully reflect the national 
characteristics. College teachers need to closely link ethnic 
culture and vocal music teaching, only by accurately grasping 
the characteristics of students, combining advanced ethnic 
music teaching ideas and teaching theories, and highlighting 
the pertinence of teaching. 
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